313 research outputs found

On the Consistency of Ordinal Regression Methods

Author: Bach Francis
Gramfort Alexandre
Pedregosa Fabian
Publication date: 01/01/2017

Many of the ordinal regression models that have been proposed in the literature can be seen as methods that minimize a convex surrogate of the zero-one, absolute, or squared loss functions. A key property for studying the statistical implications of such approximations is Fisher consistency. Fisher consistency is a desirable property for surrogate loss functions: it implies that in the population setting, i.e., if the probability distribution that generates the data were available, optimization of the surrogate would yield the best possible model. In this paper we characterize the Fisher consistency of a rich family of surrogate loss functions used in the context of ordinal regression, including support vector ordinal regression, ORBoosting and least absolute deviation. We show that, for a family of surrogate loss functions that subsumes support vector ordinal regression and ORBoosting, consistency can be fully characterized by the derivative of a real-valued function at zero, as happens for convex margin-based surrogates in binary classification. We also derive excess risk bounds for a surrogate of the absolute error that generalize existing risk bounds for binary classification. Finally, our analysis suggests a novel surrogate of the squared error loss. We compare this novel surrogate with competing approaches on 9 different datasets. Our method proves highly competitive in practice, outperforming the least squares loss on 7 out of 9 datasets.

Comment: Journal of Machine Learning Research 18 (2017)

arXiv.org e-Print Archive

INRIA a CCSD electronic archive server

HAL-CEA

Calibration of One-Class SVM for MV set estimation

Author: Feuillard Vincent
Gramfort Alexandre
Thomas Albert
Publication date: 30/08/2015

A general approach to anomaly detection or novelty detection consists in estimating high-density regions, or Minimum Volume (MV) sets. The One-Class Support Vector Machine (OCSVM) is a state-of-the-art algorithm for estimating such regions from high-dimensional data, yet it suffers from practical limitations. When applied to a limited number of samples, it can perform poorly even with the best hyperparameters. Moreover, the OCSVM solution is very sensitive to the choice of hyperparameters, which makes it hard to optimize in an unsupervised setting. We present a new approach to estimating MV sets using the OCSVM with a different choice of the parameter controlling the proportion of outliers. The solution function of the OCSVM is learnt on a training set, and the desired probability mass is obtained by adjusting the offset on a test set to prevent overfitting. Models learnt on different train/test splits are then aggregated to reduce the variance induced by such random splits. Our approach makes it possible to tune the hyperparameters automatically and to obtain nested set estimates. Experimental results show that our approach outperforms the standard OCSVM formulation while suffering less from the curse of dimensionality than kernel density estimates. Results on actual data sets are also presented.

Comment: IEEE DSAA 2015, Oct 2015, Paris, France
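The offset-adjustment step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: `fit_toy_scorer` is a hypothetical stand-in for the OCSVM solution function, all function names are invented for illustration, and averaging the calibrated decision functions over splits is an assumption, since the abstract does not say how the models are aggregated.

```python
import math
import random


def calibrate_offset(scores, mass):
    """Largest threshold t such that at least `mass` of the scores satisfy s >= t."""
    if not scores:
        raise ValueError("scores must be non-empty")
    if not 0.0 < mass <= 1.0:
        raise ValueError("mass must be in (0, 1]")
    ordered = sorted(scores, reverse=True)
    k = math.ceil(mass * len(ordered))  # number of held-out points kept inside the set
    return ordered[k - 1]


def fit_toy_scorer(train):
    """Hypothetical stand-in for the OCSVM solution function (higher = more typical)."""
    center = sum(train) / len(train)
    return lambda x: -abs(x - center)


def aggregated_set_estimate(data, mass, n_splits=5, seed=0):
    """Average calibrated decision functions over random splits (assumed scheme)."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_splits):
        shuffled = data[:]
        rng.shuffle(shuffled)
        half = len(shuffled) // 2
        train, test = shuffled[:half], shuffled[half:]
        score = fit_toy_scorer(train)  # learnt on the training part only
        # The offset is set on held-out points so that a fraction `mass` falls inside.
        offset = calibrate_offset([score(x) for x in test], mass)
        models.append((score, offset))

    def decision(x):
        # A value >= 0 means x lies inside the aggregated set estimate.
        return sum(score(x) - offset for score, offset in models) / len(models)

    return decision
```

Calling `aggregated_set_estimate(data, 0.8)` returns a decision function for a set meant to hold roughly 80% of the probability mass. In a real setting the toy scorer would be replaced by the decision function of a fitted OCSVM.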

arXiv.org e-Print Archive

Crossref

A priori par normes mixtes pour les problèmes inverses Application à la localisation de sources en M/EEG

Author: Gramfort Alexandre
Kowalski Matthieu
Publication venue: HAL CCSD
Publication date: 08/09/2009

National audience

We consider underdetermined inverse problems, and more specifically source localization in magneto- and electroencephalography (M/EEG). In these problems, although a physical model of the diffusion (or "mixing") of the sources is available, their highly underdetermined nature makes the inversion very difficult. Finding priors on the sources that are both strong and physically relevant is one of the hard parts of the problem. Here, classical sparsity measured by an \ell_1 norm is not sufficient and gives unrealistic results. We propose to take structured sparsity into account through mixed norms, in particular a three-level mixed norm. The method is applied to MEG signals from somatosensory stimulation experiments. When stimulated, the different fingers of the hand activate distinct regions of the primary sensory cortex. Using a three-level mixed norm makes it possible to inject this prior into the inverse problem and thus recover the correct cortical organization of the active areas. We also show that the methods conventionally used in the field fail at this task.

HAL-CentraleSupelec

HAL AMU

INRIA a CCSD electronic archive server

HAL-CEA

HAL-Ecole des Ponts ParisTech

HAL-Rennes 1

Improving M/EEG source localization with an inter-condition sparse prior

Author: Gramfort Alexandre
Kowalski Matthieu
Publication venue: HAL CCSD
Publication date: 28/06/2009

International audience

The inverse problem with distributed dipole models in M/EEG is strongly ill-posed, which requires setting priors on the solution. The most common priors are based on a convenient \ell_2 norm. However, such methods are known to smear the estimated distribution of cortical currents. To provide sparser solutions, norms other than \ell_2 have been proposed in the literature, but they often do not pass the test of real data. Here we propose to solve the inverse problem on multiple experimental conditions simultaneously and to constrain the corresponding active regions to be different, while preserving the robust \ell_2 prior over space and time. This approach is based on a mixed norm that sets an \ell_1 prior between conditions. The optimization is performed with an efficient iterative algorithm able to handle highly sampled distributed models. The method is evaluated on two synthetic datasets reproducing the organization of the primary somatosensory cortex (S1) and the primary visual cortex (V1), and validated with MEG somatosensory data.

Crossref

HAL AMU

INRIA a CCSD electronic archive server

HAL-Ecole des Ponts ParisTech

HRF estimation improves sensitivity of fMRI encoding and decoding models

Author: Eickenberg Michael
Gramfort Alexandre
Pedregosa Fabian
Thirion Bertrand
Publication date: 13/05/2013

Extracting activation patterns from functional Magnetic Resonance Imaging (fMRI) datasets remains challenging in rapid-event designs due to the inherent delay of the blood oxygen level-dependent (BOLD) signal. The general linear model (GLM) makes it possible to estimate the activation from a design matrix and a fixed hemodynamic response function (HRF). However, the HRF is known to vary substantially between subjects and brain regions. In this paper, we propose a model for jointly estimating the HRF and the activation patterns via a low-rank representation of task effects. This model is based on the linearity assumption behind the GLM and can be computed using standard gradient-based solvers. We use the activation patterns computed by our model as input data for encoding and decoding studies and report performance improvement in both settings.

Comment: 3rd International Workshop on Pattern Recognition in NeuroImaging (2013)

arXiv.org e-Print Archive

Crossref

INRIA a CCSD electronic archive server

HAL-CEA

GAP Safe screening rules for sparse multi-task and multi-class models

Author: Fercoq Olivier
Gramfort Alexandre
Ndiaye Eugene
Salmon Joseph
Publication date: 18/11/2015

High-dimensional regression benefits from sparsity-promoting regularizations. Screening rules leverage the known sparsity of the solution by ignoring some variables in the optimization, hence speeding up solvers. When the procedure is proven not to discard features wrongly, the rules are said to be "safe". In this paper we derive new safe rules for generalized linear models regularized with \ell_1 and \ell_1/\ell_2 norms. The rules are based on duality gap computations and spherical safe regions whose diameters converge to zero. This makes it possible to safely discard more variables, in particular for low regularization parameters. The GAP Safe rule can cope with any iterative solver, and we illustrate its performance on coordinate descent for multi-task Lasso, binary and multinomial logistic regression, demonstrating significant speed-ups on all tested datasets with respect to previous safe rules.

Comment: in Proceedings of the 29th Conference on Neural Information Processing Systems (NIPS), 2015
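As an illustration of a duality-gap-based spherical safe region, here is a minimal sketch for the plain Lasso, the simplest \ell_1 case. The formulas are not taken from this abstract: they assume the objective 0.5 * ||y - X beta||^2 + lam * ||beta||_1, a dual point obtained by rescaling the residual, and a sphere of radius sqrt(2 * gap) / lam around it. The function name and the pure-Python list representation are illustrative choices.

```python
import math


def gap_safe_screen_lasso(X, y, beta, lam):
    """Indices of features that a gap-based sphere test certifies as zero.

    X is a list of rows, y and beta are lists, lam > 0.
    Assumed objective: 0.5 * ||y - X beta||^2 + lam * ||beta||_1.
    """
    n, p = len(X), len(beta)
    residual = [y[i] - sum(X[i][j] * beta[j] for j in range(p)) for i in range(n)]
    corr = [sum(X[i][j] * residual[i] for i in range(n)) for j in range(p)]

    # Rescale the residual so the dual point is feasible: ||X^T theta||_inf <= 1.
    scale = max(lam, max(abs(c) for c in corr))
    theta = [r / scale for r in residual]

    primal = 0.5 * sum(r * r for r in residual) + lam * sum(abs(b) for b in beta)
    dual = 0.5 * sum(v * v for v in y) - 0.5 * lam ** 2 * sum(
        (theta[i] - y[i] / lam) ** 2 for i in range(n)
    )
    gap = max(primal - dual, 0.0)  # clamp tiny negative values due to rounding
    radius = math.sqrt(2.0 * gap) / lam

    screened = []
    for j in range(p):
        col_norm = math.sqrt(sum(X[i][j] ** 2 for i in range(n)))
        bound = abs(sum(X[i][j] * theta[i] for i in range(n))) + radius * col_norm
        if bound < 1.0:  # the dual constraint cannot be active anywhere in the sphere
            screened.append(j)
    return screened
```

With X the 2x2 identity, y = [3.0, 0.1] and lam = 1.0, nothing is screened at beta = [0.0, 0.0], where the gap is large. At the optimum beta = [2.0, 0.0] the gap vanishes and feature 1 is discarded, which shows how the safe region shrinks as the solver converges.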

arXiv.org e-Print Archive